NSF PAR Search | NSF Public Access Repository

Differentiable GPU-Parallelized Task and Motion Planning

Shen, William; Garrett, Caelan; Kumar, Nishanth; Goyal, Ankit; Hermans, Tucker; Lozano-Perez, Tomas; Ramos, Fabio (June 2025, Robotics science and systems)

Planning long-horizon robot manipulation requires making discrete decisions about which objects to interact with and continuous decisions about how to interact with them. A robot planner must select grasps, placements, and motions that are feasible and safe. This class of problems falls under Task and Motion Planning (TAMP) and poses significant computational challenges in terms of algorithm runtime and solution quality, particularly when the solution space is highly constrained. To address these challenges, we propose a new bilevel TAMP algorithm that leverages GPU parallelism to efficiently explore thousands of candidate continuous solutions simultaneously. Our approach uses GPU parallelism to sample an initial batch of solution seeds for a plan skeleton and to apply differentiable optimization on this batch to satisfy plan constraints and minimize solution cost with respect to soft objectives. We demonstrate that our algorithm can effectively solve highly constrained problems with non-convex constraints in just seconds, substantially outperforming serial TAMP approaches, and validate our approach on multiple real-world robots.
Free, publicly-accessible full text available June 21, 2026
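The batch "sample seeds, then optimize them all at once" idea described in the abstract above can be illustrated with a toy sketch. This is not the authors' algorithm or code: it uses NumPy vectorization on the CPU as a stand-in for GPU parallelism, a hand-written gradient instead of automatic differentiation, and a made-up 2D problem (reach a goal point while staying outside a circular obstacle). All function names, parameters, and values below are assumptions chosen for illustration.

```python
import numpy as np

def optimize_batch(seeds, goal, obstacle, radius, steps=200, lr=0.05, weight=10.0):
    """Run penalized gradient descent on every seed at once (vectorized).

    Hypothetical toy problem: minimize ||x - goal||^2 (soft objective)
    subject to ||x - obstacle|| >= radius (non-convex constraint).
    """
    x = seeds.copy()
    for _ in range(steps):
        diff = x - obstacle                                  # shape (N, 2)
        dist = np.linalg.norm(diff, axis=1, keepdims=True)   # shape (N, 1)
        violation = np.maximum(0.0, radius - dist)           # > 0 inside the obstacle
        # Gradient of the soft objective ||x - goal||^2.
        grad_cost = 2.0 * (x - goal)
        # Gradient of the penalty weight * violation^2.
        grad_penalty = -2.0 * weight * violation * diff / np.maximum(dist, 1e-9)
        x -= lr * (grad_cost + grad_penalty)
    return x

def select_best(x, goal, obstacle, radius, tol=1e-2):
    """Return the lowest-cost candidate among those satisfying the constraint."""
    dist = np.linalg.norm(x - obstacle, axis=1)
    feasible = dist >= radius - tol
    cost = np.sum((x - goal) ** 2, axis=1)
    cost = np.where(feasible, cost, np.inf)   # infeasible candidates never win
    best = int(np.argmin(cost))
    return x[best], bool(feasible[best])

# Assumed toy setup: 1024 random seeds in a 6 x 6 square.
rng = np.random.default_rng(0)
goal = np.array([2.0, 0.0])
obstacle = np.array([0.0, 0.0])
seeds = rng.uniform(-3.0, 3.0, size=(1024, 2))

solutions = optimize_batch(seeds, goal, obstacle, radius=1.0)
best, best_is_feasible = select_best(solutions, goal, obstacle, radius=1.0)
```

In this toy problem, seeds that start almost directly behind the obstacle can stall against it, which is one reason a large batch of diverse seeds helps when constraints are non-convex: only one seed needs to converge to a feasible, low-cost solution.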
Guiding long-horizon task and motion planning with vision language models.

Yang, Zhutian; Garrett, Caelan; Kumar, Nishanth; Fox, Dieter; Lozano-Perez, Tomas; Kaelbling, Leslie (June 2025, Proceedings IEEE International Conference on Robotics and Automation)

Vision-Language Models (VLMs) can generate plausible high-level plans when prompted with a goal, the context, an image of the scene, and any planning constraints. However, there is no guarantee that the predicted actions are geometrically and kinematically feasible for a particular robot embodiment. As a result, many prerequisite steps such as opening drawers to access objects are often omitted in their plans. Robot task and motion planners can generate motion trajectories that respect the geometric feasibility of actions and insert physically necessary actions, but do not scale to everyday problems that require common-sense knowledge and involve large state spaces comprised of many variables. We propose VLM-TAMP, a hierarchical planning algorithm that leverages a VLM to generate both semantically-meaningful and horizon-reducing intermediate subgoals that guide a task and motion planner. When a subgoal or action cannot be refined, the VLM is queried again for replanning. We evaluate VLM-TAMP on kitchen tasks where a robot must accomplish cooking goals that require performing 30-50 actions in sequence and interacting with up to 21 objects. VLM-TAMP substantially outperforms baselines that rigidly and independently execute VLM-generated action sequences, both in terms of success rates (50 to 100% versus 0%) and average task completion percentage (72 to 100% versus 15 to 45%).
Free, publicly-accessible full text available June 2, 2026
NOD-TAMP: Generalizable Long-Horizon Planning with Neural Object Descriptors

Cheng, Shuo; Garrett, Caelan Reed; Mandlekar, Ajay; Xu, Danfei (November 2024, Proceedings of The 8th Conference on Robot Learning)

Full Text Available
DiMSam: Diffusion Models as Samplers for Task and Motion Planning under Partial Observability

Fang, Xiaolin; Garrett, Caelan; Eppner, Clemens; Lozano-Perez, Tomas; Kaelbling, Leslie; Fox, Dieter (October 2024, IEEE/RSJ International Conference on Intelligent Robots and Systems)

Generative models, such as diffusion models, excel at capturing high-dimensional distributions with diverse input modalities, e.g., robot trajectories, but are less effective at multi-step constraint reasoning. Task and Motion Planning (TAMP) approaches are suited for planning multi-step autonomous robot manipulation. However, it can be difficult to apply them to domains where the environment and its dynamics are not fully known. We propose to overcome these limitations by composing diffusion models using a TAMP system. We use the learned components for constraints and samplers that are difficult to engineer in the planning model, and use a TAMP solver to search for the task plan with constraint-satisfying action parameter values. To tractably make predictions for unseen objects in the environment, we define the learned samplers and TAMP operators on learned latent embeddings of changing object states. We evaluate our approach in a simulated articulated object manipulation domain and show how the combination of classical TAMP, generative modeling, and latent embedding enables multi-step constraint-based reasoning. We also apply the learned sampler in the real world.
Full Text Available
Sequence-Based Plan Feasibility Prediction for Efficient Task and Motion Planning

Yang, Zhutian; Garrett, Caelan; Lozano-Perez, Tomas; Kaelbling, Leslie; Fox, Dieter (January 2023, Robotics science and systems)

We present a learning-enabled Task and Motion Planning (TAMP) algorithm for solving mobile manipulation problems in environments with many articulated and movable obstacles. Our idea is to bias the search procedure of a traditional TAMP planner with a learned plan feasibility predictor. The core of our algorithm is PIGINet, a novel Transformer-based learning method that takes in a task plan, the goal, and the initial state, and predicts the probability of finding motion trajectories associated with the task plan. We integrate PIGINet within a TAMP planner that generates a diverse set of high-level task plans, sorts them by their predicted likelihood of feasibility, and refines them in that order. We evaluate the runtime of our TAMP algorithm on seven families of kitchen rearrangement problems, comparing its performance to that of non-learning baselines. Our experiments show that PIGINet substantially improves planning efficiency, cutting down runtime by 80% on problems with small state spaces and 10%-50% on larger ones, after being trained on only 150-600 problems. Finally, it also achieves zero-shot generalization to problems with unseen object categories thanks to its visual encoding of objects.
Full Text Available
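The "sort candidate task plans by predicted feasibility, then refine them in that order" loop described in the abstract above can be sketched as follows. This is only an illustration under stated assumptions: the predictor and refiner are hypothetical toy functions standing in for a learned model such as PIGINet and for a real motion-level refinement step, and none of the names below come from the paper.

```python
from typing import Callable, Optional, Sequence, Tuple

Plan = Tuple[str, ...]

def refine_in_predicted_order(
    plans: Sequence[Plan],
    predict_feasibility: Callable[[Plan], float],
    refine: Callable[[Plan], Optional[str]],
) -> Tuple[Optional[Plan], Optional[str], int]:
    """Try plans from most to least likely feasible; stop at the first success."""
    ranked = sorted(plans, key=predict_feasibility, reverse=True)
    attempts = 0
    for plan in ranked:
        attempts += 1
        motion = refine(plan)  # expensive step in a real planner
        if motion is not None:
            return plan, motion, attempts
    return None, None, attempts

# Hypothetical stand-in for a learned feasibility predictor.
def toy_predictor(plan: Plan) -> float:
    return 0.9 if "open_drawer" in plan else 0.1

# Hypothetical stand-in for motion refinement: only plans that open
# the drawer first can be refined in this toy domain.
def toy_refiner(plan: Plan) -> Optional[str]:
    if "open_drawer" in plan:
        return "trajectory for " + " -> ".join(plan)
    return None

candidates = [
    ("pick_cup", "place_cup"),
    ("push_door", "pick_cup", "place_cup"),
    ("open_drawer", "pick_cup", "place_cup"),
]
plan, motion, attempts = refine_in_predicted_order(candidates, toy_predictor, toy_refiner)
```

In this toy example, refining the candidates in their listed order would take three attempts, whereas ranking them first reaches the feasible plan on the first attempt. The savings in a real planner depend on how accurate the predictor is.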
Learning compositional models of robot skills for task and motion planning

https://doi.org/10.1177/02783649211004615

Wang, Zi; Garrett, Caelan Reed; Kaelbling, Leslie Pack; Lozano-Pérez, Tomás (June 2021, The International Journal of Robotics Research)
The objective of this work is to augment the basic abilities of a robot by learning to use sensorimotor primitives to solve complex long-horizon manipulation problems. This requires flexible generative planning that can combine primitive abilities in novel combinations and, thus, generalize across a wide variety of problems. In order to plan with primitive actions, we must have models of the actions: under what circumstances will executing this primitive successfully achieve some particular effect in the world? We use, and develop novel improvements to, state-of-the-art methods for active learning and sampling. We use Gaussian process methods for learning the constraints on skill effectiveness from small numbers of expensive-to-collect training examples. In addition, we develop efficient adaptive sampling methods for generating a comprehensive and diverse sequence of continuous candidate control parameter values (such as pouring waypoints for a cup) during planning. These values become end-effector goals for traditional motion planners that then solve for a full robot motion that performs the skill. By using learning and planning methods in conjunction, we take advantage of the strengths of each and plan for a wide variety of complex dynamic manipulation tasks. We demonstrate our approach in an integrated system, combining traditional robotics primitives with our newly learned models using an efficient robot task and motion planner. We evaluate our approach both in simulation and in the real world through measuring the quality of the selected primitive actions. Finally, we apply our integrated system to a variety of long-horizon simulated and real-world manipulation problems.
Full Text Available
Integrated Task and Motion Planning

https://doi.org/10.1146/annurev-control-091420-084139

Garrett, Caelan Reed; Chitnis, Rohan; Holladay, Rachel; Kim, Beomjoon; Silver, Tom; Kaelbling, Leslie Pack; Lozano-Pérez, Tomás (May 2021, Annual Review of Control, Robotics, and Autonomous Systems)

The problem of planning for a robot that operates in environments containing a large number of objects, taking actions to move itself through the world as well as to change the state of the objects, is known as task and motion planning (TAMP). TAMP problems contain elements of discrete task planning, discrete–continuous mathematical programming, and continuous motion planning and thus cannot be effectively addressed by any of these fields directly. In this article, we define a class of TAMP problems and survey algorithms for solving them, characterizing the solution methods in terms of their strategies for solving the continuous-space subproblems and their techniques for integrating the discrete and continuous components of the search.
Full Text Available